Nonparametrically Learning Activation Functions in Deep Neural Nets
Authors
Abstract
We provide a principled framework for nonparametrically learning activation functions in deep neural networks. Currently, state-of-the-art deep networks treat the choice of activation function as a hyperparameter fixed before training. By allowing activation functions to be estimated as part of the training procedure, we expand the class of functions that each node in the network can learn. We also provide a theoretical justification for our choice of nonparametric activation functions and demonstrate that networks with our nonparametric activation functions generalize well. To demonstrate the power of our novel techniques, we test them on image recognition datasets and achieve up to a 15% relative increase in test performance compared to the baseline.
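To make the approach concrete, the sketch below shows one common way to estimate an activation function during training: parameterize it as a learnable linear combination of fixed basis functions and fit the coefficients by backpropagation. This is an illustrative assumption, not the paper's exact estimator; the shifted-ReLU basis, the knot range, and the class name LearnableActivation are all hypothetical choices.

```python
# A minimal sketch (assumed, not the paper's exact estimator): an activation
# learned jointly with the network weights, written as a linear combination
# of fixed shifted-ReLU basis functions whose coefficients are trainable.
import torch
import torch.nn as nn

class LearnableActivation(nn.Module):
    """f(x) = sum_k coef[k] * relu(x - knots[k]); coef is learned."""
    def __init__(self, num_basis=8):
        super().__init__()
        # Fixed "knot" locations; range and count are illustrative choices.
        self.register_buffer("knots", torch.linspace(-2.0, 2.0, num_basis))
        coef = torch.zeros(num_basis)
        coef[num_basis // 2] = 1.0  # start close to a plain shifted ReLU
        self.coef = nn.Parameter(coef)

    def forward(self, x):
        # Build (batch, features, num_basis) hinge features, then take a
        # weighted sum over the basis dimension.
        hinges = torch.relu(x.unsqueeze(-1) - self.knots)
        return hinges @ self.coef

# The activation slots into a network like any fixed nonlinearity; its
# coefficients are updated by the same optimizer as the layer weights.
net = nn.Sequential(nn.Linear(784, 256), LearnableActivation(),
                    nn.Linear(256, 10))
```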
Similar resources
On Loss Functions for Deep Neural Networks in Classification
Deep neural networks are currently among the most commonly used classifiers. Beyond easily achieving very good performance, one of the best selling points of these models is their modular design: one can conveniently adapt the architecture to specific needs, change connectivity patterns, attach specialised layers, and experiment with a large number of activation functions and normalisation schemes...
Universal Function Approximation by Deep Neural Nets with Bounded Width and ReLU Activations
This article concerns the expressive power of depth in neural nets with ReLU activations and bounded width. We are particularly interested in the following questions: what is the minimal width w_min(d) so that ReLU nets of width w_min(d) (and arbitrary depth) can approximate any continuous function on the unit cube [0, 1]^d arbitrarily well? For ReLU nets near this minimal width, what can one say ...
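For readers who want to experiment with this regime, here is a minimal sketch (assumed, not from the article) of the object under study: a ReLU net of constant width and configurable depth, evaluated on points of the unit cube. All names and default sizes are illustrative.

```python
# A minimal sketch (not from the article): a constant-width, variable-depth
# ReLU net on the unit cube, the family whose expressive power is studied.
import torch
import torch.nn as nn

def narrow_relu_net(d_in=2, width=4, depth=10, d_out=1):
    """Stack `depth` hidden layers, each of the same width, with ReLUs."""
    layers = [nn.Linear(d_in, width), nn.ReLU()]
    for _ in range(depth - 1):
        layers += [nn.Linear(width, width), nn.ReLU()]
    layers.append(nn.Linear(width, d_out))
    return nn.Sequential(*layers)

net = narrow_relu_net()
x = torch.rand(16, 2)   # sample points in the unit cube [0, 1]^2
y = net(x)              # shape (16, 1)
```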
Kernel Methods for Deep Learning
We introduce a new family of positive-definite kernel functions that mimic the computation in large, multilayer neural nets. These kernel functions can be used in shallow architectures, such as support vector machines (SVMs), or in deep kernel-based architectures that we call multilayer kernel machines (MKMs). We evaluate SVMs and MKMs with these kernel functions on problems designed to illustr...
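Assuming this is Cho and Saul's paper, the family in question would be the arc-cosine kernels; the degree-1 member is sketched below and can be plugged into any kernel machine as a precomputed Gram matrix. The function name and the small epsilon guard are illustrative.

```python
# A minimal sketch (assuming the arc-cosine kernel family): the degree-1
# kernel k(x, y) = (1/pi) * |x| * |y| * (sin t + (pi - t) cos t), where
# t is the angle between x and y. Composing the kernel with itself gives
# the "multilayer" variant used in multilayer kernel machines.
import numpy as np

def arccos_kernel_deg1(X, Y):
    nx = np.linalg.norm(X, axis=1, keepdims=True)      # (m, 1) row norms
    ny = np.linalg.norm(Y, axis=1, keepdims=True).T    # (1, n) row norms
    cos_t = np.clip(X @ Y.T / (nx * ny + 1e-12), -1.0, 1.0)
    t = np.arccos(cos_t)
    return (nx * ny / np.pi) * (np.sin(t) + (np.pi - t) * np.cos(t))

K = arccos_kernel_deg1(np.random.randn(5, 3), np.random.randn(4, 3))
# K could be passed to, e.g., sklearn.svm.SVC(kernel="precomputed").
```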
Solving Fuzzy Equations Using Neural Nets with a New Learning Algorithm
Artificial neural networks have advantages such as learning, adaptation, fault tolerance, parallelism, and generalization. This paper offers a novel method for finding a solution of a fuzzy equation that is assumed to have a real solution. To this end, we apply an architecture of fuzzy neural networks such that the corresponding connection weights are real numbers. The ...
Bounds for the Computational
It is shown that feedforward neural nets of constant depth with piecewise polynomial activation functions and arbitrary real weights can be simulated for Boolean inputs and outputs by neural nets of a somewhat larger size and depth with Heaviside gates and weights from {0, 1}. This provides the first known upper bound for the computational power and VC-dimension of such neural nets. It is also sh...
Publication date: 2016